Automatic Segmentation and Labeling of Speech Corpus Based on HMM with Adaptation

Authors

  • Donglai ZHU
  • Yu HU
  • Ren-Hua WANG
Abstract

In this article we propose applying acoustic model adaptation to the automatic segmentation and labeling of a speech corpus. Because segmentation based only on a speaker-independent model is not precise enough, the speaker-independent model should be transformed into a speaker-dependent one. Training a speaker-dependent model from scratch requires a large amount of training data and a lot of time, whereas adaptation can modify the model parameters to match the current speaker quickly, using only a small amount of training data, and yields comparatively precise segmentation results. To make the segmentation results more precise still, we also combine adaptation with boundary adjustment based on acoustic and phonetic features and adopt an iterative procedure.
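The align-adapt-realign idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say which adaptation technique is used, so the sketch assumes a simple MAP-style update of Gaussian means with a hypothetical prior weight `tau`, one single-Gaussian state per phone, one-dimensional features, and no transition probabilities. The function names `forced_align` and `map_adapt_means` are illustrative only, and the boundary adjustment step is not modeled.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def forced_align(frames, means, variances):
    """Viterbi forced alignment for a left-to-right state sequence.

    Every state must be visited in order and occupy at least one frame.
    Returns the start frame index of each state.
    """
    T, S = len(frames), len(means)
    if T < S:
        raise ValueError("need at least one frame per state")
    neg_inf = float("-inf")
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_gauss(frames[0], means[0], variances[0])
    for t in range(1, T):
        for s in range(S):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else neg_inf
            best, prev = (stay, s) if stay >= move else (move, s - 1)
            if best > neg_inf:
                score[t][s] = best + log_gauss(frames[t], means[s], variances[s])
                back[t][s] = prev
    # Backtrace from the last state at the last frame.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return [path.index(s) for s in range(S)]


def map_adapt_means(frames, starts, means, tau=2.0):
    """MAP-style mean update (assumed here, not taken from the paper).

    Each speaker-independent mean is interpolated with the frames that the
    alignment assigned to that state; tau is a hypothetical prior weight.
    """
    bounds = list(starts) + [len(frames)]
    adapted = []
    for s, mu in enumerate(means):
        segment = frames[bounds[s]:bounds[s + 1]]
        adapted.append((tau * mu + sum(segment)) / (tau + len(segment)))
    return adapted


if __name__ == "__main__":
    # Toy data: three "phones" whose true centers differ from the SI means.
    frames = [0.1, -0.2, 0.0, 2.9, 3.2, 3.1, 6.0, 5.8]
    means, variances = [0.5, 2.5, 5.0], [1.0, 1.0, 1.0]
    for _ in range(2):  # align, adapt, realign
        starts = forced_align(frames, means, variances)
        means = map_adapt_means(frames, starts, means)
    print(starts, means)
```

With only a few frames per state, the update pulls each mean toward the current speaker without discarding the speaker-independent prior, which is the trade-off the abstract describes between full retraining and adaptation.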


Similar resources

Refined speech segmentation for concatenative speech synthesis

High accuracy phonetic segmentation is critical for achieving good quality in concatenative text to speech synthesis. Due to the shortcomings of current automated techniques based on HMM-based alignment or Dynamic Time Warping (DTW), manual verification and labeling are often required. In this paper we present a novel technique for automatic placement of phoneme boundaries in a speech waveform ...



Towards A Phoneme Labeled Mandarin Chinese Speech Corpus

Phoneme-level transcription of speech corpora is crucial to fundamental speech research and to detection-based automatic speech recognition, which is attracting increasing interest. Currently, there is no existing phoneme-labeled Mandarin Chinese speech corpus. This paper presents our recent work towards development of such a corpus. Our goal is to label five hours of speech data selected from a Mandarin Chine...


Generation of Unit Databases for the Upc Text to Speech System

This paper describes a method for the generation of unit databases for concatenative text-to-speech systems. The method comprises the automatic segmentation and pitch synchronous labeling of the units and a selection procedure to extract the best instance per unit from a generic speech corpus. The segmentation is performed by an automatic HMM alignment. The introduction of the demiphone improve...


Some Aspects of ASR Transcription Based Unsupervised Speaker Adaptation for HMM Speech Synthesis

Statistical parametric synthesis offers numerous techniques to create new voices. Speaker adaptation is one of the most exciting ones. However, it still requires high-quality audio data with a low signal-to-noise ratio and precise labeling. This paper presents an automatic speech recognition based unsupervised adaptation method for Hidden Markov Model (HMM) speech synthesis and its quality evalu...



Publication date: 2000